Partial leverage

In statistics, high-leverage points are those that are outliers with respect to the independent variables. Leverage points are those that cause large changes in the parameter estimates when they are deleted. Although a leverage point will typically have high leverage, a high leverage point is not necessarily an influential point. The leverage is typically defined as the diagonal of the hat matrix

 H = X(X'X)^{-1}X'.  \,

Partial leverage is used to measure the contribution of the individual independent variables to the leverage of each observation. That is, if hi is the ith row of the diagonal of the hat matrix, the partial leverage is a measure of how hi changes as a variable is added to the regression model.

The partial leverage is computed as:


\left(\mathrm{PL}_j\right)_i = \frac{\left(X_{j\bullet[j]}\right)_i^2}{\sum_{k=1}^n\left(X_{j\bullet[j]}\right)_k^2}

where

j = index of independent variable
i = index of observation
Xj·[j] = residuals from regressing Xj against the remaining independent variables

Note that the partial leverage is the leverage of the ith point in the partial regression plot for the jth variable. Data points with large partial leverage for an independent variable can exert undue influence on the selection of that variable in automatic regression model building procedures.

See also

External links

References

 This article incorporates public domain material from websites or documents of the National Institute of Standards and Technology.